AITopics | gender bias

Collaborating Authors

gender bias

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Gender Bias in Emotion Recognition by Large Language Models

Herbert, Maureen, Sun, Katie, Lim, Angelica, Etesam, Yasaman

arXiv.org Artificial IntelligenceNov-26-2025

The rapid advancement of large language models (LLMs) and their growing integration into daily life underscore the importance of evaluating and ensuring their fairness. In this work, we examine fairness within the domain of emotional theory of mind, investigating whether LLMs exhibit gender biases when presented with a description of a person and their environment and asked, "How does this person feel?". Furthermore, we propose and evaluate several debiasing strategies, demonstrating that achieving meaningful reductions in bias requires training based interventions rather than relying solely on inference-time prompt-based approaches such as prompt engineering.

emotion label, large language model, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2511.19785

Country: North America (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.98)

Add feedback

GenderBench: Evaluation Suite for Gender Biases in LLMs

Pikuliak, Matúš

arXiv.org Artificial IntelligenceMay-20-2025

We present GenderBench -- a comprehensive evaluation suite designed to measure gender biases in LLMs. GenderBench includes 14 probes that quantify 19 gender-related harmful behaviors exhibited by LLMs. We release GenderBench as an open-source and extensible library to improve the reproducibility and robustness of benchmarking across the field. We also publish our evaluation of 12 LLMs. Our measurements reveal consistent patterns in their behavior. We show that LLMs struggle with stereotypical reasoning, equitable gender representation in generated texts, and occasionally also with discriminatory behavior in high-stakes scenarios, such as hiring.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2505.12054

Country:

North America > United States > Florida > Miami-Dade County > Miami (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(4 more...)

Genre: Research Report > New Finding (0.93)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

From Structured Prompts to Open Narratives: Measuring Gender Bias in LLMs Through Open-Ended Storytelling

Chen, Evan, Zhan, Run-Jun, Lin, Yan-Bai, Chen, Hung-Hsuan

arXiv.org Artificial IntelligenceMar-20-2025

Large Language Models (LLMs) have revolutionized natural language processing, yet concerns persist regarding their tendency to reflect or amplify social biases present in their training data. This study introduces a novel evaluation framework to uncover gender biases in LLMs, focusing on their occupational narratives. Unlike previous methods relying on structured scenarios or carefully crafted prompts, our approach leverages free-form storytelling to reveal biases embedded in the models. Systematic analyses show an overrepresentation of female characters across occupations in six widely used LLMs. Additionally, our findings reveal that LLM-generated occupational gender rankings align more closely with human stereotypes than actual labor statistics. These insights underscore the need for balanced mitigation strategies to ensure fairness while avoiding the reinforcement of new stereotypes.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2503.15904

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > New York > New York County > New York City (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(11 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Health & Medicine (1.00)
Government > Regional Government (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)

Add feedback

A database to support the evaluation of gender biases in GPT-4o output

Mehner, Luise, Fiedler, Lena Alicija Philine, Ammon, Sabine, Kolossa, Dorothea

arXiv.org Artificial IntelligenceFeb-28-2025

The widespread application of Large Language Models (LLMs) involves ethical risks for users and societies. A prominent ethical risk of LLMs is the generation of unfair language output that reinforces or exacerbates harm for members of disadvantaged social groups through gender biases (Weidinger et al., 2022; Bender et al., 2021; Kotek et al., 2023). Hence, the evaluation of the fairness of LLM outputs with respect to such biases is a topic of rising interest. To advance research in this field, promote discourse on suitable normative bases and evaluation methodologies, and enhance the reproducibility of related studies, we propose a novel approach to database construction. This approach enables the assessment of gender-related biases in LLM-generated language beyond merely evaluating their degree of neutralization.

category, gender bias, standpoint theory, (13 more...)

arXiv.org Artificial Intelligence

2502.20898

Country:

North America > United States (0.14)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
Europe > Netherlands > South Holland > Dordrecht (0.04)
(2 more...)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Add feedback

Assessing Gender Bias in LLMs: Comparing LLM Outputs with Human Perceptions and Official Statistics

Bas, Tetiana

arXiv.org Artificial IntelligenceNov-20-2024

This study investigates gender bias in large language models (LLMs) by comparing their gender perception to that of human respondents, U.S. Bureau of Labor Statistics data, and a 50% no-bias benchmark. We created a new evaluation set using occupational data and role-specific sentences. Unlike common benchmarks included in LLM training data, our set is newly developed, preventing data leakage and test set contamination. Five LLMs were tested to predict the gender for each role using single-word answers. We used Kullback-Leibler (KL) divergence to compare model outputs with human perceptions, statistical data, and the 50% neutrality benchmark. All LLMs showed significant deviation from gender neutrality and aligned more with statistical data, still reflecting inherent biases.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2411.13738

Country: North America > United States (0.88)

Genre: Research Report (1.00)

Industry: Government > Regional Government > North America Government > United States Government (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.37)

Add feedback

Gender Bias of LLM in Economics: An Existentialism Perspective

Zhong, Hui, Chen, Songsheng, Liang, Mian

arXiv.org Artificial IntelligenceOct-13-2024

Large Language Models (LLMs), such as GPT-4 and BERT, have rapidly gained traction in natural language processing (NLP) and are now integral to financial decision-making. However, their deployment introduces critical challenges, particularly in perpetuating gender biases that can distort decision-making outcomes in high-stakes economic environments. This paper investigates gender bias in LLMs through both mathematical proofs and empirical experiments using the Word Embedding Association Test (WEAT), demonstrating that LLMs inherently reinforce gender stereotypes even without explicit gender markers. By comparing the decision-making processes of humans and LLMs, we reveal fundamental differences: while humans can override biases through ethical reasoning and individualized understanding, LLMs maintain bias as a rational outcome of their mathematical optimization on biased data. Our analysis proves that bias in LLMs is not an unintended flaw but a systematic result of their rational processing, which tends to preserve and amplify existing societal biases encoded in training data. Drawing on existentialist theory, we argue that LLM-generated bias reflects entrenched societal structures and highlights the limitations of purely technical debiasing methods. This research underscores the need for new theoretical frameworks and interdisciplinary methodologies that address the ethical implications of integrating LLMs into economic and financial decision-making. We advocate for a reconceptualization of how LLMs influence economic decisions, emphasizing the importance of incorporating human-like ethical considerations into AI governance to ensure fairness and equity in AI-driven financial systems.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2410.19775

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > China > Guangdong Province > Zhuhai (0.04)
North America > United States > Massachusetts (0.04)
(2 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Banking & Finance > Financial Services (0.68)
Law > Civil Rights & Constitutional Law (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

'Since Lawyers are Males..': Examining Implicit Gender Bias in Hindi Language Generation by LLMs

Joshi, Ishika, Gupta, Ishita, Dey, Adrita, Parikh, Tapan

arXiv.org Artificial IntelligenceSep-20-2024

Large Language Models (LLMs) are increasingly being used to generate text across various languages, for tasks such as translation, customer support, and education. Despite these advancements, LLMs show notable gender biases in English, which become even more pronounced when generating content in relatively underrepresented languages like Hindi. This study explores implicit gender biases in Hindi text generation and compares them to those in English. We developed Hindi datasets inspired by WinoBias to examine stereotypical patterns in responses from models like GPT-4o and Claude-3 sonnet. Our results reveal a significant gender bias of 87.8% in Hindi, compared to 33.4% in English GPT-4o generation, with Hindi responses frequently relying on gender stereotypes related to occupations, power hierarchies, and social class. This research underscores the variation in gender biases across languages and provides considerations for navigating these biases in generative AI systems.

language model, query, stereotype, (15 more...)

arXiv.org Artificial Intelligence

2409.13484

Country:

North America > United States > New York > New York County > New York City (0.05)
Asia > India > NCT > New Delhi (0.04)
Europe > Switzerland (0.04)
(8 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Education (1.00)
Law (0.93)
Social Sector (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Unveiling Gender Bias in Large Language Models: Using Teacher's Evaluation in Higher Education As an Example

Huang, Yuanning

arXiv.org Artificial IntelligenceSep-15-2024

This paper investigates gender bias in Large Language Model (LLM)-generated teacher evaluations in higher education setting, focusing on evaluations produced by GPT-4 across six academic subjects. By applying a comprehensive analytical framework that includes Odds Ratio (OR) analysis, Word Embedding Association Test (WEAT), sentiment analysis, and contextual analysis, this paper identified patterns of gender-associated language reflecting societal stereotypes. Specifically, words related to approachability and support were used more frequently for female instructors, while words related to entertainment were predominantly used for male instructors, aligning with the concepts of communal and agentic behaviors. The study also found moderate to strong associations between male salient adjectives and male names, though career and family words did not distinctly capture gender biases. These findings align with prior research on societal norms and stereotypes, reinforcing the notion that LLM-generated text reflects existing biases.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2409.09652

Country: North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Education > Curriculum > Subject-Specific Education (0.93)
Education > Educational Setting > Higher Education (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Are models biased on text without gender-related language?

AIHubJul-9-2024, 09:07:35 GMT

Have you ever needed to write a recommendation letter for a student or co-worker and struggled to remember their contributions? Or maybe you're applying for a job or internship and don't know how to start your cover letter. Language Models (LMs), like ChatGPT [1], have become useful tools for writing various types of content, including professional documents like recommendation or cover letters [2, 3]. By providing just a few details, such as the person's name, age, gender, and current position, you can quickly generate initial drafts of these documents. With ChatGPT, users can quickly create an initial draft of a reference letter by specifying a few details about the individual.

computational linguistic, non-stereotypical sentence, stereotype, (14 more...)

AIHub

Country:

Europe > Austria > Vienna (0.15)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.05)
Asia > Singapore (0.05)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.77)

Add feedback

Surprising gender biases in GPT

Fulgu, Raluca Alexandra, Capraro, Valerio

arXiv.org Artificial IntelligenceJul-8-2024

We present seven experiments exploring gender biases in GPT. Initially, GPT was asked to generate demographics of a potential writer of twenty phrases containing feminine stereotypes and twenty with masculine stereotypes. Results show a strong asymmetry, with stereotypically masculine sentences attributed to a female more often than vice versa. For example, the sentence "I love playing fotbal! Im practicing with my cosin Michael" was constantly assigned by ChatGPT to a female writer. This phenomenon likely reflects that while initiatives to integrate women in traditionally masculine roles have gained momentum, the reverse movement remains relatively underdeveloped. Subsequent experiments investigate the same issue in high-stakes moral dilemmas. GPT-4 finds it more appropriate to abuse a man to prevent a nuclear apocalypse than to abuse a woman. This bias extends to other forms of violence central to the gender parity debate (abuse), but not to those less central (torture). Moreover, this bias increases in cases of mixed-sex violence for the greater good: GPT-4 agrees with a woman using violence against a man to prevent a nuclear apocalypse but disagrees with a man using violence against a woman for the same purpose. Finally, these biases are implicit, as they do not emerge when GPT-4 is directly asked to rank moral violations. These results highlight the necessity of carefully managing inclusivity efforts to prevent unintended discrimination.

gender, gpt-4, nuclear apocalypse, (15 more...)

arXiv.org Artificial Intelligence

2407.06003

Country:

Asia > Middle East > Republic of Türkiye > Batman Province > Batman (0.04)
Europe > Italy > Lombardy > Milan (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.94)

Industry:

Health & Medicine (1.00)
Law (0.69)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback